The Weighted CFG Constraint



George Katsirelos, Nina Narodytska and Toby Walsh 
University of New South Wales and NICTA, Sydney, Australia



Abstract. We introduce the weighted CFG constraint and propose a propagation
algorithm that enforces domain consistency in $O(n^3 |G|)$ time. We show that
this algorithm can be decomposed into a set of primitive arithmetic constraints
without hindering propagation.



1 Introduction 

One very promising method for rostering and other domains is to specify constraints
via grammars or automata that accept some language. We can specify constraints in
this way on, for instance, the number of consecutive night shifts or the number of days
off in each 7-day period. With the REGULAR constraint [4], we specify the acceptable
assignments to a sequence of variables by a deterministic finite automaton. One
limitation of this approach is that the automaton may need to be large. For example,
there are regular languages which can only be defined by an automaton with an
exponential number of states. Researchers have therefore looked higher up the Chomsky
hierarchy. In particular, the CFG constraint [8, 6] permits us to specify constraints
using any context-free grammar. In this paper, we consider a further generalization to
the weighted CFG constraint. This can model over-constrained problems and problems
with preferences.

2 The weighted CFG constraint

In a context-free grammar, rules have a left-hand side with just one non-terminal, and
a right-hand side consisting of terminals and non-terminals. Any context-free grammar
can be written in Chomsky normal form, in which the right-hand side of a rule is just
one terminal or two non-terminals. The weighted $\mathrm{WCFG}(G, W, z, [X_1, \ldots, X_n])$
constraint holds iff an assignment $X$ forms a string belonging to the grammar $G$ and
the minimal weight of a derivation of $X$ is less than or equal to $z$. The matrix $W$
defines the weights of productions in the grammar $G$. The weight of a derivation is the
sum of the weights of the productions used in the derivation. The WCFG constraint is
domain consistent iff for each variable, every value in its domain can be extended to
an assignment satisfying the constraint.
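As a small illustration of the definition, the sketch below sums production weights over a derivation tree. The toy grammar, its weights and the helper names `derivation_weight` and `derived_string` are assumptions made for this example; they do not come from the paper.

```python
# Hypothetical toy grammar in Chomsky normal form. Keys are (head, rhs),
# values are production weights. All names and weights are illustrative.
W = {
    ("S", ("A", "B")): 0,
    ("A", ("a",)): 0,
    ("A", ("b",)): 1,
    ("B", ("b",)): 0,
}


def derivation_weight(tree, weights):
    """Sum the weights of all productions used in a derivation tree.

    A tree is (head, child, ...), where each child is either a terminal
    (a plain string) or another tree.
    """
    head, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    total = weights[(head, rhs)]
    for child in children:
        if not isinstance(child, str):
            total += derivation_weight(child, weights)
    return total


def derived_string(tree):
    """Read the terminals off the leaves of a derivation tree, left to right."""
    _, *children = tree
    out = []
    for child in children:
        out.extend([child] if isinstance(child, str) else derived_string(child))
    return out


tree_ab = ("S", ("A", "a"), ("B", "b"))  # derives "ab" using zero-weight rules
tree_bb = ("S", ("A", "b"), ("B", "b"))  # derives "bb" using A -> b of weight 1
```

Under these assumed weights, the assignment "bb" satisfies the constraint for any $z \ge 1$, since its only derivation uses one production of weight 1.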

We give a propagator for the WCFG constraint based on an extension of the CYK
parser to probabilistic grammars [3]. We assume that $G$ is in Chomsky normal form
with a single start non-terminal $S$. The algorithm has two stages. In the first, we
construct a dynamic programming table $V[i, j]$ where an element $A$ of $V[i, j]$ is a
potential non-terminal that generates the substring of length $j$ starting at position
$i$, $[X_i, \ldots, X_{i+j-1}]$. We compute a lower bound $l[i, j, A]$ on the minimal
weight of a derivation from $A$. In the second stage, we move from $V[1, n]$ to the
bottom of table $V$. For an element $A$ of $V[i, j]$, we compute an upper bound
$u[i, j, A]$ on the maximal weight of a derivation from $A$ of the substring
$[X_i, \ldots, X_{i+j-1}]$. We mark the element $A$ iff $l[i, j, A] \le u[i, j, A]$. The
pseudo-code is presented in Algorithm 1. Lines 2-5 initialize $l$ and $u$. Lines 6-16
compute the first stage, whilst lines 20-29 compute the second stage. Finally, we prune
inconsistent values in lines 30-31. Algorithm 1 enforces domain consistency in
$O(|G| n^3)$ time.



Algorithm 1 The weighted CYK propagator

 1: procedure WCYK-ALG(G, W, z, [X_1, ..., X_n])
 2:   for j = 1 to n do
 3:     for i = 1 to n − j + 1 do
 4:       for each A ∈ G do
 5:         l[i, j, A] = z + 1; u[i, j, A] = −1;
 6:   for i = 1 to n do
 7:     V[i, 1] = {A | A → a ∈ G, a ∈ D(X_i)}
 8:     for A ∈ V[i, 1] s.t. A → a ∈ G, a ∈ D(X_i) do
 9:       l[i, 1, A] = min{l[i, 1, A], W[A → a]};
10:   for j = 2 to n do
11:     for i = 1 to n − j + 1 do
12:       V[i, j] = ∅;
13:       for k = 1 to j − 1 do
14:         V[i, j] = V[i, j] ∪ {A | A → BC ∈ G, B ∈ V[i, k], C ∈ V[i + k, j − k]}
15:         for each A → BC ∈ G s.t. B ∈ V[i, k], C ∈ V[i + k, j − k] do
16:           l[i, j, A] = min{l[i, j, A], W[A → BC] + l[i, k, B] + l[i + k, j − k, C]};
17:   if S ∉ V[1, n] then
18:     return 0;
19:   mark (1, n, S); u[1, n, S] = z;
20:   for j = n downto 2 do
21:     for i = 1 to n − j + 1 do
22:       for A such that (i, j, A) is marked do
23:         for k = 1 to j − 1 do
24:           for each A → BC ∈ G s.t. B ∈ V[i, k], C ∈ V[i + k, j − k] do
25:             if W[A → BC] + l[i, k, B] + l[i + k, j − k, C] > u[i, j, A] then
26:               continue;
27:             mark (i, k, B); mark (i + k, j − k, C);
28:             u[i, k, B] = max{u[i, k, B], u[i, j, A] − l[i + k, j − k, C] − W[A → BC]};
29:             u[i + k, j − k, C] = max{u[i + k, j − k, C], u[i, j, A] − l[i, k, B] − W[A → BC]};
30:   for i = 1 to n do
31:     D(X_i) = {a ∈ D(X_i) | A → a ∈ G, (i, 1, A) is marked and W[A → a] ≤ u[i, 1, A]};
32:   return 1;
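The two stages of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the experiments. The dictionary encoding of the grammar, the function name `wcyk_propagate` and the toy grammar are assumptions. Weights are assumed to be non-negative, and the sketch returns `None` as soon as the cheapest parse exceeds $z$, where Algorithm 1 would instead end with empty domains.

```python
def wcyk_propagate(unary, binary, start, z, domains):
    """Prune `domains` for WCFG; return None when no solution exists.

    unary maps (A, a) to the weight of A -> a; binary maps (A, B, C) to the
    weight of A -> B C. Positions i are 1-based and j is a substring length,
    as in Algorithm 1. A key present in `low` plays the role of A in V[i, j];
    a key present in `up` plays the role of a marked triple.
    """
    n = len(domains)
    inf = float("inf")
    low, up = {}, {}

    # Stage 1 (lines 6-9): cheapest terminal production per position.
    for i in range(1, n + 1):
        for (A, a), w in unary.items():
            if a in domains[i - 1]:
                low[i, 1, A] = min(low.get((i, 1, A), inf), w)

    # Stage 1 (lines 10-16): combine shorter spans bottom-up.
    for j in range(2, n + 1):
        for i in range(1, n - j + 2):
            for k in range(1, j):
                for (A, B, C), w in binary.items():
                    left, right = (i, k, B), (i + k, j - k, C)
                    if left in low and right in low:
                        cost = w + low[left] + low[right]
                        low[i, j, A] = min(low.get((i, j, A), inf), cost)

    # Lines 17-19, plus an early exit when the cheapest parse exceeds z.
    if low.get((1, n, start), inf) > z:
        return None
    up[1, n, start] = z

    # Stage 2 (lines 20-29): push the weight budget down to the children.
    for j in range(n, 1, -1):
        for i in range(1, n - j + 2):
            for k in range(1, j):
                for (A, B, C), w in binary.items():
                    left, right = (i, k, B), (i + k, j - k, C)
                    if (i, j, A) not in up or left not in low or right not in low:
                        continue
                    budget = up[i, j, A]
                    if w + low[left] + low[right] > budget:
                        continue
                    up[left] = max(up.get(left, -1), budget - low[right] - w)
                    up[right] = max(up.get(right, -1), budget - low[left] - w)

    # Lines 30-31: keep a value only if a marked A derives it within budget.
    return [
        {a for a in domains[i - 1]
         if any(a == b and (i, 1, A) in up and w <= up[i, 1, A]
                for (A, b), w in unary.items())}
        for i in range(1, n + 1)
    ]


# Hypothetical toy grammar: S -> A B, with weighted terminal productions.
unary = {("A", "a"): 0, ("A", "b"): 1, ("B", "b"): 0, ("B", "a"): 2}
binary = {("S", "A", "B"): 0}
pruned = wcyk_propagate(unary, binary, "S", 1, [{"a", "b"}, {"a", "b"}])
```

With $z = 1$ in this toy instance, value a is removed from the second domain because B → a alone costs 2, while both values stay in the first domain (the strings "ab" and "bb" have weights 0 and 1).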



3 Decomposition of the weighted CFG constraint

As an alternative to this monolithic propagator, we propose a simple decomposition
with which we can also enforce domain consistency. A decomposition has several
advantages. For example, it is easy to add to any constraint solver. As a second
example, decomposition gives an efficient incremental propagator, and opens the door to
advanced techniques like nogood learning and watched literals. The idea of the
decomposition is to introduce arithmetic constraints to compute $l$ and $u$. Given the
table $V$ obtained by Algorithm 1, we construct the corresponding AND/OR directed
acyclic graph (DAG) as in [7]. We label an OR node by $n(i, j, A)$, and an AND node by
$n(i, j, k, A \to BC)$. We denote the parents of a node $nd$ as $PRT(nd)$ and the
children as $CHD(nd)$. For each node two integer variables are introduced to compute
$l$ and $u$. For an OR node $nd$, these are $l_O(nd)$ and $u_O(nd)$, whilst for an AND
node $nd$, these are $l_A(nd)$ and $u_A(nd)$.

For each AND node $nd = n(i, j, k, A \to BC)$ we post a constraint to connect $nd$
to its children $CHD(nd)$:

$$l_A(nd) = \sum_{n_c \in CHD(nd)} l_O(n_c) + W[A \to BC] \qquad (1)$$

For each OR node $nd = n(i, j, A)$ we post constraints to connect $nd$ to its children
$CHD(nd)$:

$$l_O(nd) = \min_{n_c \in CHD(nd)} \{ l_A(n_c) \} \qquad (2)$$

$$u_O(nd) = u_A(n_c), \quad n_c \in CHD(nd) \qquad (3)$$

For each OR node $nd = n(i, j, A)$ we post a set of constraints to connect $nd$ to its
parents $PRT(nd)$ and siblings:

$$u_O(nd) = \max_{n_p \in PRT(nd)} \{ u_A(n_p) - l_O(n_{sb}) - W[P] \} \qquad (4)$$

where $P = B \to AC$ or $B \to CA$, $n_p = n(r, q, t, P)$ is the parent of
$nd = n(i, j, A)$ and $n_{sb} = n(i_1, j_1, C)$.

Finally, we introduce constraints to prune $X_i$. For each leaf of the DAG that is an
OR node $nd = n(i, 1, a)$, we introduce:

$$a \in D(X_i) \Leftrightarrow 0 \le l_O(nd) \le z \qquad (5)$$

$$a \notin D(X_i) \Leftrightarrow l_O(nd) > z \qquad (6)$$

$$l_O(nd) > u_O(nd) \Rightarrow a \notin D(X_i) \qquad (7)$$

As the maximal weight of a derivation is less than or equal to $z$ we post:

$$u_O(n(1, n, S)) \le z \qquad (8)$$

Bounds propagation will set the lower bound of $l_O(n(i, j, A))$ to the minimal weight
of a derivation from $A$, and the upper bound of $u_O(n(i, j, A))$ to the maximum weight
of a derivation from $A$. We forbid branching on the variables $l_{A|O}$ and $u_{A|O}$,
as branching on $l_{A|O}$ would change the weights matrix $W$ and branching on
$u_{A|O}$ would add additional restrictions to the weight of a derivation. Bounds
propagation on this decomposition enforces domain consistency on the WCFG constraint.
If we invoke constraints in the decomposition in the same order as we compute the
table $V$, this takes $O(n^3 |G|)$ time. For simpler grammars, propagation is faster.
For instance, as in the unweighted case, it takes just $O(n |G|)$ time on a regular
grammar.
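To see how constraints (1) to (4) and (8) interact, the sketch below evaluates them once on a small hand-built AND/OR DAG. It is a fixed-point evaluation in plain Python rather than bounds propagation inside a solver. The tuple encoding of nodes, the function name `evaluate_dag` and the leaf lower bounds are assumptions made for the example.

```python
def evaluate_dag(and_nodes, leaf_low, root, z):
    """Evaluate constraints (1)-(4) and (8) once on a fixed AND/OR DAG.

    and_nodes: list of (parent, left, right, weight), one entry per AND node,
    where parent, left and right are OR nodes written as (i, j, A).
    leaf_low: assumed lower bounds for the OR nodes with j == 1.
    """
    children = {}
    for parent, left, right, w in and_nodes:
        children.setdefault(parent, []).append((left, right, w))

    low = dict(leaf_low)

    def l_or(node):
        if node not in low:
            # (1) gives the value of each AND child, (2) takes the minimum.
            low[node] = min(l_or(b) + l_or(c) + w for b, c, w in children[node])
        return low[node]

    l_or(root)

    up = {root: z}  # (8), taken at its upper bound
    # Visit OR nodes from the longest span to the shortest, so that the
    # upper bound of a node is final before it is passed to its children.
    for node in sorted(children, key=lambda nd: -nd[1]):
        if node not in up:
            continue
        for b, c, w in children[node]:
            u_and = up[node]  # (3)
            up[b] = max(up.get(b, float("-inf")), u_and - low[c] - w)  # (4)
            up[c] = max(up.get(c, float("-inf")), u_and - low[b] - w)  # (4)
    return low, up


# Hypothetical instance with n = 2 and productions S -> AB and S -> BA.
root = (1, 2, "S")
a1, b1, a2, b2 = (1, 1, "A"), (1, 1, "B"), (2, 1, "A"), (2, 1, "B")
and_nodes = [(root, a1, b2, 0), (root, b1, a2, 0)]
leaf_low = {a1: 0, b2: 0, b1: 2, a2: 1}
low, up = evaluate_dag(and_nodes, leaf_low, root, z=1)
supported = {nd for nd in leaf_low if low[nd] <= up[nd]}
```

In this instance the nodes for B at position 1 and A at position 2 end with a lower bound above their upper bound, which is the situation in which constraint (7) removes the corresponding values. No explicit marking is needed: a parent whose cheapest derivation exceeds its budget passes its children an upper bound below their lower bound.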

We can speed up propagation by recognizing when constraints are entailed. If
$l_O(nd) > u_O(nd)$ holds for an OR node $nd$ then constraints (4) and (2) are entailed.
If $l_A(nd) > u_A(nd)$ holds for an AND node $nd$ then constraints (1) and (3) are
entailed. To model entailment we augmented each of these constraints in such a way
that if $l_O(nd) > u_O(nd)$ or $l_A(nd) > u_A(nd)$ holds then the corresponding
constraints are not invoked by the solver.



4 The Soft CFG constraint 



We can use the WCFG constraint to encode a soft version of the CFG constraint, which
is useful for modelling over-constrained problems. The
$\mathrm{SOFTCFG}(G, z, [X_1, \ldots, X_n])$ constraint holds iff the string
$[X_1, \ldots, X_n]$ is at most distance $z$ from a string in $G$. We consider both
Hamming and edit distances. We encode the $\mathrm{SOFTCFG}(G, z, [X_1, \ldots, X_n])$
constraint as a weighted $\mathrm{WCFG}(G', W, z, [X_1, \ldots, X_n])$ constraint. For
Hamming distance, for each production $A \to a \in G$, we introduce additional unit
weight productions to simulate substitution:

$$\{A \to b,\ W[A \to b] = 1 \mid A \to a \in G,\ A \to b \notin G,\ b \in \Sigma\}$$

Existing productions have zero weight. For edit distance, we introduce additional
productions to simulate substitution, insertion and deletion:

$$\{A \to b,\ W[A \to b] = 1 \mid A \to a \in G,\ A \to b \notin G,\ b \in \Sigma\} \ \cup$$
$$\{A \to \varepsilon,\ W[A \to \varepsilon] = 1 \mid a \in \Sigma\} \ \cup$$
$$\{A \to Aa,\ W[A \to Aa] = 1 \mid a \in \Sigma\} \ \cup$$
$$\{A \to aA,\ W[A \to aA] = 1 \mid a \in \Sigma\}$$

To handle $\varepsilon$ productions we modify Algorithm 1 so that the loops in lines 13
and 23 run from 0 to $j$.
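The Hamming-distance construction can be sketched as a small grammar transformation. The function name `hamming_soft_weights` and the set-of-pairs encoding of terminal productions are assumptions for illustration. Only terminal productions are shown, since the binary productions are existing productions and keep weight zero.

```python
def hamming_soft_weights(unary, alphabet):
    """Build the weights of the terminal productions of G' for Hamming distance.

    unary: set of (A, a) pairs, one per terminal production A -> a of G.
    alphabet: iterable of terminal symbols (the set Sigma).
    Returns a dict mapping (A, b) to the weight of A -> b in G'.
    """
    # Existing productions have zero weight.
    weights = {(head, a): 0 for head, a in unary}
    # Every non-terminal with a terminal production gets a unit-weight
    # production for each symbol it could not already derive (substitution).
    for head in {head for head, _ in unary}:
        for b in alphabet:
            weights.setdefault((head, b), 1)
    return weights


# Hypothetical grammar fragment with two terminal productions.
soft = hamming_soft_weights({("A", "a"), ("B", "b")}, "ab")
```

Under this encoding, a derivation of weight $w$ in $G'$ corresponds to a string of $G$ in which $w$ positions have been substituted, so bounding the weight by $z$ bounds the Hamming distance.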

5 Experimental results 

We evaluated these propagation methods on shift-scheduling benchmarks [2, 1]. A
personal schedule is subject to various regulation rules, e.g. a full-time employee has
to have a one-hour lunch. These rules are encoded into a context-free grammar augmented
with restrictions on productions [7, 5]. A schedule for an employee has $n = 96$ slots
represented by $n$ variables. In each slot, an employee can work on an activity
($a_i$), take a break ($b$), lunch ($l$) or rest ($r$). These rules are represented by
the following grammar:

$$S \to RPR,\ f_P(i, j) \equiv 13 \le j \le 24, \qquad P \to WbW, \qquad L \to lL \mid l,\ f_L(i, j) \equiv j = 4$$
$$S \to RFR,\ f_F(i, j) \equiv 30 \le j \le 38, \qquad R \to rR \mid r, \qquad W \to A_i,\ f_W(i, j) \equiv j \ge 4$$
$$A_i \to a_i A_i \mid a_i,\ f_A(i, j) \equiv open(i), \qquad F \to PLP$$

where functions $f(i, j)$ are restrictions on productions and $open(i)$ is a function
that returns 1 if the business is open at the $i$th slot and 0 otherwise. To model
labour demand for a slot we introduce Boolean variables $b(i, j, a_k)$, equal to 1 if
the $j$th employee performs activity $a_k$ at the $i$th time slot. For each time slot
$i$ and activity $a_k$ we post a constraint
$\sum_{j=1}^{m} b(i, j, a_k) \ge d(i, a_k)$, where $m$ is the number of employees. The
goal is to minimize the number of slots in which employees worked.
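The demand constraint and the objective can be checked on a complete set of schedules with a few lines of Python. The list-of-lists encoding, the function names `demand_met` and `worked_slots`, and the four-slot toy data are assumptions for illustration; the benchmarks use 96 slots.

```python
def demand_met(schedules, demand):
    """Check sum_j b(i, j, a_k) >= d(i, a_k) for every listed slot and activity.

    schedules[j][i] is the symbol assigned to employee j in slot i (0-based).
    demand maps (slot, activity) to the required head count d(i, a_k).
    """
    return all(
        sum(1 for row in schedules if row[slot] == activity) >= need
        for (slot, activity), need in demand.items()
    )


def worked_slots(schedules, activities):
    """Count the slots, over all employees, that are spent on an activity."""
    return sum(1 for row in schedules for symbol in row if symbol in activities)


# Hypothetical two-employee, four-slot instance with a single activity "a1".
schedules = [
    ["r", "a1", "a1", "r"],
    ["r", "a1", "b", "a1"],
]
demand = {(1, "a1"): 2, (2, "a1"): 1}
ok = demand_met(schedules, demand)
cost = worked_slots(schedules, {"a1"})
```

Here `cost` plays the role of the objective: the number of slots in which employees worked.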

We used Gecode 2.0.1 for our experiments and ran them on an Intel Xeon 2.0 GHz
with 4 GB of RAM. In the first set of experiments, we used the
$\mathrm{WCFG}(G, z_j, X)$, $j = 1, \ldots, m$ with zero weights. Our monolithic
propagator gave similar results to the unweighted CFG propagator from [7].
Decompositions were slower than decompositions of the unweighted CFG constraint as the
former uses integers instead of Booleans. In the second set of experiments, we assigned
weight 1 to activity productions, like $A_i \to a_i$, and posted an additional cost
function $\sum_{j=1}^{m} z_j$ that is minimized. $\sum_{j=1}^{m} z_j$ is the number of
slots in which employees worked. Results are presented in Table 1. We improved on the
best solution found in the first model in 4 benchmarks and proved optimality in one.
The decomposition of the weighted CFG constraint was slightly slower than the
monolithic propagator, while entailment improved performance in most cases.
[Table 1: per-benchmark results; the numeric entries could not be recovered from the extracted text.]

Table 1. All benchmarks have a one-hour time limit. $|A|$ is the number of activities,
$m$ is the number of employees, cost shows the total number of slots in which employees
worked in the best solution, time is the time to find the best solution, bt is the
number of backtracks to find the best solution, BT is the number of backtracks in one
hour, Opt shows if optimality is proved, and Imp shows if a lower cost solution is found
by the second model.